SECTION I


General 

This report discusses some preliminary ideas and assumptions
which have evolved during Phases I and II of the [illegible] systems study.

As such it evidences a developing picture of the overall system beyond 
that presented in the back-up paper to the Phase I outline report. 

Section 1 discusses the general system capabilities as presently 
conceived. 

Section 2 is the General Development Plan and some general
observations and assumptions.

Section 3 presents the Performance Specifications and a number
of working assumptions in regard to performance objectives.

In Section 4 the implications of all of the above capabilities,
objectives and assumptions in the resulting system configuration are
discussed.

While much of what is said here is subject to revision and change,
it is being furnished now to members of the CHIVE Evaluation Group in
the hope that it will contribute to a better understanding of the
CHIVE proposal.

1. General Systems Capabilities

This section covers the major functions that the system
should or might perform. The capabilities are based on information
gathered during the Fact Finding phase of the project and the firm
conclusions drawn therefrom. Also included are some capabilities not yet
made firm goals by management, but which might be required. Therefore,
this section covers the maximum capabilities that will be expected of
the system with the understanding that any of them may be relaxed.

1.1 Document Handling 


The general requirement is to be able to "handle" all
documents in use by the analytic offices which this system is to serve.
Some relatively minor exceptions must be made, such as bound books for
general use, certain graphic material, etc. For all material, the
system must provide for the document to be indexed, both the index and
the document (and/or a micro image of it) to be stored, and for retrieval
of the index, the document, or a copy of it upon request of an analyst
or other user. Users must be able to request documents in terms
meaningful to themselves although this does not exclude use of intermediary
personnel who are specialists in file querying.

1.2 Information 

The system must also be able to handle factual data files
which represent the final or intermediate results of analysis. Examples
of such files are biographic summary files (as opposed to dossiers
which are collections of documents), equipment characteristics files,
gazetteers, etc. The principal differences between such information
files and document files are that information files: (a) are usually
not written in natural languages -- rather they are formatted, conveying
information to the reader by virtue of position of words on the page
as well as content, (b) are not copies of documents -- they are end
products -- they contain "answers" to questions, and (c) may be searched
directly by computers whereas document files are usually searched only
through the medium of an index file.

"Handling" information files implies providing facilities
to users to permit creation of new files, erasure of old files, changing
of files, storage of files, adding information to files, retrieving
information on a selective basis, or retrieving an entire file.

1.3 Centralized System Control

In order to improve service to users, the system should
allow them to query all information in the system from any query point.
A query point is the physical location at which an analyst presents his
information needs directly to the system or to a specialist in querying.
This capability speeds the analyst's search for information, reduces
the amount of time and effort he must expend, and raises his confidence
in the system. The implication to the system is that all material to
be handled, regardless of source or classification, be indexed in an
internally consistent manner, that a centralized index be maintained
or that decentralized information stores be available to all query
points, and that any physical access point be able to provide the same
information to an analyst. (For security reasons, some access points
may not be able to provide delivery but the analyst there should be
able to direct searching of an all-source file.)

1.4 Specialized Files

Many individual analysts or analytic organizations prefer
to be able to maintain their own files, in addition to the centralized
files. The reasons are generally two: (a) faster access, (b) analysts
are not always sure enough of the information to want to submit it to
a central point. The system should have the capability to permit such
individual preference without prejudicing overall performance. For
example, there is no reason why an individual may not retain a document
in his desk so long as a copy is also in the central system. This
gives everyone access to it. Information files can be similarly treated.
Analysts may retain personal information files, may also store them in
the system where the computer can be used for statistical analysis
of the files or other calculations, or the analytic office may choose
to enter an "official" version of a file for general use but retain
a "hunch" file in their own office. It is also possible to store an
information file in the system but deny access to all but the proper
analytic office.

Approved For Release 2002/05/06 : CIA-RDP78-03940A000200010012-1

1.5 Document Dissemination

The system should have the capability to determine the
dissemination of incoming documents on the basis of the document index
and of analysts' statements of interest. In establishing this
requirement, it is understood that implementation of automatic dissemination
may be made on a piecemeal basis, that is, document class by document
class and organization by organization.
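
The matching of a document index against statements of interest can be
sketched as follows. This is only an illustration under stated
assumptions: the profile structure, the function name disseminate, and
all sample terms are hypothetical and do not come from the study. The
enabled_classes argument stands in for the piecemeal, class-by-class
introduction of automatic dissemination.

```python
from dataclasses import dataclass, field


@dataclass
class InterestProfile:
    """One analyst's statement of interest (hypothetical structure)."""
    analyst: str
    terms: set
    document_classes: set = field(default_factory=set)  # empty means any class


def disseminate(index_terms, document_class, profiles, enabled_classes):
    """Return the analysts who should receive a document, sorted by name."""
    if document_class not in enabled_classes:
        return []  # class not yet converted to automatic dissemination
    wanted = set(index_terms)
    recipients = []
    for profile in profiles:
        class_ok = (not profile.document_classes
                    or document_class in profile.document_classes)
        if class_ok and wanted & profile.terms:
            recipients.append(profile.analyst)
    return sorted(recipients)


# Invented sample data, for illustration only.
profiles = [
    InterestProfile("analyst_a", {"petroleum", "pipelines"}),
    InterestProfile("analyst_b", {"railways"}, {"field_report"}),
]
print(disseminate(["pipelines", "steel"], "field_report",
                  profiles, {"field_report"}))
```

A document whose class is not yet in enabled_classes returns no
recipients here, which corresponds to leaving that class on whatever
manual routing is already in use.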

1.6 System Response Times

The time required by the system to respond to a query, to
disseminate new information, or to store new information in a file
should be such as not to delay analytic offices in the performance of
their tasks. That is, analyst schedules should not be held up by
delays in posting, disseminating, or retrieving information. In the
case of queries submitted by analysts involved in long term research
projects, this requirement places little pressure on the system. In
the case of analysts engaged in analysis of rapidly changing,
day-to-day situations, this is a very significant requirement.

2. General Development Plan

The project plan as presently established is for a four-phase
development:

Phase I: System Requirements Study.

This effort (which led to the System Capabilities shown in
Section 1) studied and evaluated user requirements for an
IR System, and developed a planned program for design and
implementation of such a system. This study was completed
in June, 1963.

Phase II: System Design.

This effort was begun in July, 1963 and calls for the design
of the IR system.

Phase III: Initial System Implementation.

This effort is planned to start in July 1964 for a period
of about one year and will result in the implementation of
the initial segment of the full system.

Phase IV: Expansion to Full System.

This effort will be concerned with gradual expansion of the
system implemented in Phase III to cover the full input load.

2.1 Initial System Objectives

The basic objective of the initial system is to establish
a small scale mechanical structure of the eventual system in a limited,
controlled environment. This system will be designed with the
performance specifications of the full system in mind for application to
a limited area. One of the basic premises of the Initial System,
however, will be that expansion in terms of added sources and increased
performance will not require substantial redesign of programs, methods
of operation or equipment configuration. This will require that the
initial system have as many features of the full system as possible,
subject to economic justification.

2.2 Role of the Contractor

As recommended in [redacted] of 3 June 1963 and agreed
to in the contract dated July 1, 1963, the contractor is to assist the
government in the System Design effort by the performance of certain
stated tasks.

These tasks are grouped under two major areas, Systems
Engineering and Program Design, and are described in detail in the contract.
During the performance of these tasks the contractor has
developed the following two sections (3 and 4) as a partial result of
our efforts to date. These sections are intended to present
performance specifications and certain equipment implications which
necessarily result from the adoption of these specifications as
working hypotheses.

2.3 System Cost Considerations

It is well to note at this point that as yet no maximum
ultimate system cost has been established. However, no inference has
been made that such a ceiling does not exist.

The design group has been using as a working assumption that
development of such a ceiling would be primarily based on a
"cost-for-total-capability" basis rather than on such other measures as a
"cost-per-document indexed" or "cost-per-query-answered."

3. Performance Specifications

3.1 Document Input 

3.1.1 Volume 

Initial System -- the initial system as defined in Section
2 will be an implementation of the full system concept on a limited
volume of document input. Although the actual method for limiting
this volume is still under consideration by Agency management, a
working assumption was made that the volume of documents to be
considered for input in one year would be 60,000.

Full System -- the full system should control current
documentary input. The estimate of yearly document input volume is
1,000,000.

Growth -- in order to perform the system engineering and
program design tasks, a range of growth rates from initial to full
system were assumed. (Refer to Figure 3-1.) It is felt that the
earliest possible date of the initial system implementation is January,
1965 or six months following the completion of the Phase II system
design. The earliest possible date for implementation of the full
system is considered to be July, 1968. The latest dates that are
reasonable to consider for implementation of the initial and full
systems are assumed to be July, 1966 and July, 197[illegible] respectively.

Again, for planning purposes, a linear growth was assumed
from the document rate for the initial system to the document rate of
the full system, and the earliest and latest growth curves were plotted
according to the range of implementation dates stated above. The area
bounded by the two curves represents the range of growth rates which
are being considered.
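
The linear growth assumption can be illustrated with a short
calculation. The sketch below is not taken from Figure 3-1; it only
interpolates between the two stated volumes (60,000 and 1,000,000
documents a year) over the earliest pair of dates. Representing dates as
fractional years and the function name yearly_input are assumptions
made for illustration. The latest curve is not computed because the
end year of the full system is not legible in this copy.

```python
def yearly_input(when, start, end, initial=60_000, full=1_000_000):
    """Linearly interpolate the yearly document input between two dates.

    Dates are fractional years: January 1965 is 1965.0, July 1968 is 1968.5.
    Values outside the interval are clamped to the initial or full volume.
    """
    if when <= start:
        return initial
    if when >= end:
        return full
    fraction = (when - start) / (end - start)
    return round(initial + fraction * (full - initial))


# Earliest growth curve: initial system January 1965, full system July 1968.
EARLIEST = (1965.0, 1968.5)

# Halfway along the earliest curve (October 1966).
print(yearly_input(1966.75, *EARLIEST))  # 530000
```

Any later pair of dates can be passed as start and end to trace the
other boundary of the range once those dates are fixed.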

3.1.2 Types 

All types of documents that are now input to the user
organization are being considered as input to the system. It is
realized that there are variations in respect to such things as security
classification, inclusion of pictures and graphics, quality of print,
reliability of information content, machine readability, etc.

3.2 Document Indexing 

3.2.1 Methods

The traditional methods of indexing may be roughly
classified as follows:

All Symbolic -- all the index entries are completely
coded fixed field terms, ISC codes and modifiers.

Controlled Keyword -- index entries contain keywords but
only those appearing in a prepared thesaurus.

Free Keyword -- any keywords may be used in the index.

Free Keywords with Links -- same as the Free Keyword
system except that linkage is permitted between keywords to indicate a
phrase or relationship between words.

Phrases -- full phrases from the documents are used as the
index so that to some degree the context in which a keyword was used
may be determined on retrieval of the phrase.

Abstract -- a concordance of the abstract is used as the
index giving additional ability to determine context.

Full Text -- a concordance of the entire text is the index
to a document. Ability to determine context is maximized.

3.2.2 Indexing Rates, Personnel, Cost

The indexing rate will vary with the chosen indexing method.
Assuming either the free keyword with linking or the phrase methods,
the indexing rate is estimated at approximately 30 minutes per
document per indexer. Of this time about 20 minutes will be devoted to
reading and the remaining 10 will be devoted to indexing operations
such as keyword selection and transcription.

The number of indexing personnel to support the proposed
system grows from about 20 for the initial system to 300 for the full
system. Again, as in document input growth, the build up of personnel
may be rapid or slow depending on the implementation dates chosen as
goals.

Manpower costs for indexing are assumed to be $7,000 per
year per indexer. This is the only cost which will be considered as
part of the indexing operation. Transcription costs will be covered
as part of the input operation.
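
The staffing and cost figures above can be checked against the stated
indexing rate with simple arithmetic. The sketch below uses only the
numbers given in this section and in 3.1.1; the function name
indexing_load and the dictionary keys are illustrative choices, not
terms from the study.

```python
MINUTES_PER_DOCUMENT = 30   # 20 minutes reading plus 10 minutes indexing
COST_PER_INDEXER = 7_000    # dollars per year per indexer


def indexing_load(documents_per_year, indexers):
    """Return total indexing hours, hours per indexer, and yearly manpower cost."""
    hours = documents_per_year * MINUTES_PER_DOCUMENT / 60
    return {
        "total_hours": hours,
        "hours_per_indexer": hours / indexers,
        "manpower_cost": indexers * COST_PER_INDEXER,
    }


print(indexing_load(60_000, 20))       # initial system
print(indexing_load(1_000_000, 300))   # full system
```

On these figures each indexer carries about 1,500 hours of indexing a
year in the initial system and about 1,667 in the full system, with
yearly manpower costs of $140,000 and $2,100,000 respectively.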

3.3 Information Files

3.3.1 Authority Files

This includes the dictionaries, name files, organization
subordination files, etc. There is an assumed requirement for
mechanized storage of these files. They should be changed or updated
immediately when necessary and be available to indexers and analysts for
queries. In addition, there is an anticipated requirement for periodic
printouts to be used at the remote indexer and analyst offices.

3.3.2 Fact Files

These files contain summary, factual information, each
item of which is identified by labeling (tagging) or by its position
in the files. They are not considered to be directly related to
documents. The information for these files may be derived from analysis
of documents which have been retrieved from the system -- but analysis
is essential to forming fact files. These fact files are considered a
necessary part of the system in order to respond to queries for
information and in order to aid in the processing of queries for documents.

3.3.3 Size of Files

The exact format and number of these information files is
not known at this time. It is assumed for planning purposes that the
information files consume approximately 1/3 of the storage space
consumed by the document index and inverted index (term) files.

3.4 File Changes

There exists a requirement to be able to change any of
the system files. Such changes are necessitated by:

1) Errors -- These may be transcription errors,
spelling errors, or indexer/analyst errors resulting in
information entering the system which may be in error.

2) Changing environment -- The information files, in
particular, are dynamic in nature. The attributes and
the entries change rapidly. Such things as organizational
changes, occupational changes, travel, etc. must be
reflected in the information files.

3) New information -- File entries are often made with
a very limited amount of information. As more information
is gathered it is often the case that documents have
been wrongly classified or false entries have been made
to information files.

3.5 Query Types

3.5.1 Information Query

Queries to the information files may be made by indexers
or, ultimately, by analysts. The queries may be aimed at retrieving
information to aid them in the performance of their duties or to aid
in the formulation of information or document queries or index records.
The latter functions are referred to as conversational querying. An
indexer should have the capability of asking short information
retrieval questions in order to:

1. Determine how a subject has been indexed before,

2. Identify a word or phrase, and

3. Determine what is already on file to avoid
redundant entry.

The analyst may not know how to phrase his question to yield the
desired result from the system. He should have the capability of
asking a series of short questions, each question being predicated on
the response to the last question. Thus, he may improve his original
query avoiding retrieval of irrelevant material or non-retrieval of
relevant material, or both.

3.5.2 Document Requests

Handling of queries resulting in retrieval of documents
is a definite requirement of this system. Document queries may result
in many pages of document output. In order to reduce the analyst
reading problems and machine processing time some intermediate outputs may
be desired:

1. The document index record may be delivered in response
to the document query. The analyst may then use the
index as an abstract of the document and accept or
reject on this basis.

2. Microimages may be transmitted to the analyst for
viewing. Thus, another screening may take place
before ordering hard copy.

3.5.3 File Interaction

Queries should make use of all information available in
the system regardless of the basic file being queried. For instance,
a document query may be expressed as the intersection of a number of
terms. If any of the terms was not in the document index, the
document would not normally be retrieved. However, if additional
information can be supplied by the information files for document query,
the document may be found to answer the question.
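
A minimal sketch of this interaction, assuming an inverted index held
as a mapping from term to document numbers and an authority file held
as a mapping from a term to equivalent terms. Both structures, the
function name search, and the sample entries are hypothetical; the
study does not specify how the files are organized.

```python
def search(terms, document_index, authority=None):
    """Return the numbers of documents that match every query term.

    document_index: term -> set of document numbers (an inverted index)
    authority: term -> equivalent terms drawn from an information file
    """
    authority = authority or {}
    result = None
    for term in terms:
        # A term may be satisfied by itself or by any equivalent term.
        candidates = {term, *authority.get(term, ())}
        hits = set()
        for candidate in candidates:
            hits |= document_index.get(candidate, set())
        result = hits if result is None else result & hits
    return result or set()


# Invented sample data, for illustration only.
index = {"bauxite": {1, 2}, "export": {2, 3}, "aluminium ore": {3}}
authority = {"bauxite": ["aluminium ore"]}

print(search(["bauxite", "export"], index))             # {2}
print(search(["bauxite", "export"], index, authority))  # {2, 3}
```

Document 3 was never indexed under "bauxite", so the plain intersection
misses it; the equivalent term supplied by the information file brings
it into the result.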

3.5.4 Cyclic Queries

Queries should be capable of initiating a series of
searches each yielding an intermediate output which is used as input to the
next search. Thus, with a very limited amount of information, the
analyst may initiate a complex query.
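
A cyclic query can be pictured as a chain of searches in which each
output becomes the next input. The sketch below is illustrative only:
the function name cyclic_query, the two lookup tables, and the names in
them are invented, and each search is reduced to a simple table lookup.

```python
def cyclic_query(seed, steps):
    """Run searches in sequence, feeding each output into the next search.

    Returns the final output and the trail of intermediate outputs.
    """
    current = seed
    trail = [current]
    for step in steps:
        current = step(current)
        trail.append(current)
    return current, trail


# Invented fact-file and index data, for illustration only.
MEMBERSHIP = {"J. Doe": "Ministry X"}       # person -> organization
DOCS_BY_ORG = {"Ministry X": [101, 102]}    # organization -> document numbers

final, trail = cyclic_query(
    "J. Doe",
    [MEMBERSHIP.get, lambda org: DOCS_BY_ORG.get(org, [])],
)
print(final)  # [101, 102]
print(trail)  # ['J. Doe', 'Ministry X', [101, 102]]
```

Starting from a name alone, the first search yields an organization and
the second yields the documents indexed under that organization.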

3.6 Query Rates 

3.6.1 Information Queries 

Requests for retrieval from information files may occur at
a rate as high as 3,000,000 per year. These queries may be from
indexers or analysts, but so high a figure would only be reached by
permitting and encouraging indexers to query while indexing.

3.6.2 Document Requests

Queries resulting in document retrieval should occur at
an approximate rate of 100,000 per year. Each query may result in the
retrieval of a number of documents. Thus, this figure is not the same
as the number of individual image retrievals.

3.7 Response Time

3.7.1 Information Queries

Requests for information should be answered in one to two
minutes. In using the conversational mode, it is important that once
the first query has been entered the responses and additional queries
proceed rapidly and without interruption.

3.7.2 Document Queries

Response time required for document queries may be split
into two categories, less than 15 minutes, and eight hours or more.
The first category is assumed to be the requirement of about
2[illegible]% of the document queries. The remaining [illegible]% will
have the less demanding requirement.




4.1 Input

4.1.1 Entry of Document Indexes

Input, in the context of this Section, refers to digital
data to be stored in information or index files. Two special problems
are created by the requirement to handle one million documents a year.
First, the cost of transcribing indexes of these documents, by traditional
means, is excessive. The very number of people -- keypunch and verifier
operators with their supervisors and administrative support -- presents a
large administrative problem. Second, error control in so large an
organization will be very difficult and conceivably could be such as to
totally block successful use of the system. That is, unless an effective
means for detecting and quickly correcting errors (e.g., misspelling a
name in an index record) can be found, file quality can quickly become
degraded and the proportion of effort spent by both people and machines
in reacting to errors can become excessive.

To attack these problems, we have looked at the basic nature
of the indexing transcribing function. For an indexer to transcribe his
data onto a data form, using careful block lettering, takes about as much
time as typing the same information. If, however, the data were
transcribed using a tape-generating or remote-input typewriter, the keypunch
operation could be avoided with a sizeable savings in input preparation
cost. As a bonus, the elapsed time to get new information entered into
the system is dramatically reduced. An even more significant bonus,
however, is that by directly connecting these transcribing devices to a
computer an excellent error control technique is provided, that is,
computer detection of errors and the immediate feedback to the indexer.
The advantages of quick feedback to the indexer are two:

(a) The indexer is presumably still working on the same
document when the feedback is received and has the
pertinent data fresh in his mind. Since re-reading
of the complete document is not necessary, an
extremely time-consuming task can be eliminated.

(b) The indexer, perhaps second only to the author, is
the person best qualified to rectify an error in
indexing for this will often require understanding of the
subject area as well as the context of the document.

This procedure applies, of course, only to machine
detectable errors. These errors are: spelling (e.g., use of a
dictionary can detect when an unknown word is used), illegal classification
codes, illegal combination of tags and/or keywords, format errors,
and possibly unusual combination of subject classes and keywords which
a computer might reject for further checking. Actual content review
must be performed by humans, as now, either by full review of all
indexes by supervisors or spot checking.
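
Three of the machine detectable error classes named above (format,
illegal classification codes, and unknown words) can be sketched as
follows. The record layout, the document number pattern, the code
table, and the dictionary are all hypothetical and were invented for
this illustration; checks on illegal or unusual combinations are
omitted.

```python
import re

# Invented reference data, for illustration only.
DICTIONARY = {"petroleum", "pipeline", "refinery"}
CLASS_CODES = {"613.2", "735.1"}
DOC_NUMBER = re.compile(r"^[A-Z]{2}-\d{6}$")  # assumed format: two letters, six digits


def check_record(record):
    """Return a list of machine-detectable problems found in one index record."""
    problems = []
    if not DOC_NUMBER.match(record.get("doc_number", "")):
        problems.append("format: doc_number")
    for code in record.get("class_codes", []):
        if code not in CLASS_CODES:
            problems.append(f"illegal classification code: {code}")
    for word in record.get("keywords", []):
        if word.lower() not in DICTIONARY:
            problems.append(f"unknown word: {word}")
    return problems


record = {
    "doc_number": "AB-000123",
    "class_codes": ["613.2", "999.9"],
    "keywords": ["Pipeline", "refinary"],
}
for problem in check_record(record):
    print(problem)
```

The list of problems is what would be fed back to the indexer while
the document is still in front of him; an empty list means only that
no machine detectable error was found, not that the content is correct.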

Where machine readable input is available, such as
Teletype tape, measures can be taken to further speed the index
transcription process. In this situation the indexer could be relieved of
the necessity to transcribe words and phrases, instead having only
to identify the word or phrase to be selected. Presuming a copy of
the text will be made available to the computer, such identification
may be performed by entering the word number rather than the actual
word into the computer. Net gain should be about half the time the
indexer would normally spend in actual transcription.
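
Selection by word number might work along the following lines. This is
a sketch under assumptions: words are numbered from 1 in reading order,
a single number selects one word, and a pair of numbers selects a
phrase. The function names and the sample sentence are invented.

```python
def number_words(text):
    """Number the words of a machine-readable text, starting at 1."""
    return {n: word for n, word in enumerate(text.split(), start=1)}


def select(numbered, entries):
    """Turn the indexer's entries into index terms.

    An integer picks one word; a (first, last) pair picks a phrase.
    """
    terms = []
    for entry in entries:
        first, last = (entry, entry) if isinstance(entry, int) else entry
        terms.append(" ".join(numbered[n] for n in range(first, last + 1)))
    return terms


numbered = number_words("New refinery opened near the northern pipeline terminal")
print(select(numbered, [2, (6, 8)]))  # ['refinery', 'northern pipeline terminal']
```

The indexer keys a few digits in place of each word or phrase, which is
where the saving in transcription time would come from.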

4.1.2 Querying 

Again, because of the large volume of input and number
of documents and records kept in storage, a means must be provided
to give users good quality responses to their queries without taking
too much time. The method chosen is to provide for conversational
querying -- a technique whereby the requester can scan files,
determine what is in the files related to his query, and, if desired,
change the question again. Reasons for changing the question might
be: insufficient data on a given subject dictates use of a broader
question; too much data requires a narrower question; or a completely
different subject approach might be suggested by file records retrieved
in response to the original query.

It is important to realize that conversational querying
not only gives an analyst the positive advantage of improving his
query rapidly, but also prevents the degradation of system performance
that results when users ask overly broad questions to "play safe".
There is no need for such action under this concept. If the user is
dissatisfied with his first results he has only to try again -- he is
not subject to a waiting period of several days to do this.

The purposes, to repeat, of conversational querying are
to permit rapid resolution of ambiguities, make associations between
subjects and index terms, and eliminate redundant information. This
process can be accomplished to some extent either by the requester
upon retrieval or by the indexer. That is, given the same capability
to make rapid queries to the file, the indexer can resolve ambiguities
and make associations between terms. This can result in more accurate,
more compact files, hence, better retrieval. Whether the process is
done at the retrieval end of the cycle or the input end or both, the
mechanical capability to make rapid queries to the file for the
purpose of improving the quality of a retrieval (either directly or
by improving the index) is essential to a system so large and diverse
in coverage. The alternative is necessarily too many records or
documents with the consequence that the analysts do not get what they
need from the system and then begin to ignore it.

4.1.3 Types of Equipment

4.1.3.1 Index Transcription

The minimum requirement for this class of equipment is
an electric typewriter, that is, one with the standard typewriter
keyboard for entry of words and numerical tags. The typewriter must
produce some form of recordable signal, such as a punched paper tape,
magnetic tape, or a signal which is transmitted directly to a computer
for storage there. The more elaborate devices give more service.
For example, a magnetic-tape producing machine exists which simplifies
correction of the tape at the local station. Direct connection to
the computer gives quick service in the form of machine detection
of errors and rapid feedback to the indexer. In some cases a
combination of two typewriters might be desired in order to permit
the user to ask questions on one, receive an answer on the other,
without breaking up the visual image of his text on the input machine.
In other words, the indexer may choose to query a system for a given
name and may not wish to disrupt the clean copy of his index with
the listing of all information about that name.

4.1.3.2 Machine Readable Input

For machine readable copy, a conventional keyboard
could be used but possibly a special, simplified keyboard would be
better. The point to consider is that when machine readable copy is
available, a great deal of transcription time can be saved by entering
only word numbers or other word identifications rather than recopying
the entire word. It would be desirable to permit the indexer to
have one hand free to assist him in following the copy while using
the other hand to operate the input keyboard. Thus, a simplified
keyboard would be desirable.

4.1.3.3 Conversational Querying and Indexing

The first impression, when considering equipment for
this task, is a cathode-ray tube console with multiple switches and
possibly a light gun. This would permit information to be displayed
rapidly. Decisions could be made and then communicated rapidly. The
excessive cost of such equipment precludes further consideration.

The same equipment proposed for indexing with a direct
connection to a computer could do the job of conversational querying
very well. That is, an electric typewriter keyboard with a direct
connection and possibly some special keys to indicate special functions.
This system is slower than a cathode-ray tube, having a maximum of
about 1,000 characters per minute against several thousand per second
on a cathode-ray tube. However, the logical functions are the same
and the typewriter has the advantage of providing copy much easier
to read. That is, there is no flicker problem and the hard copy is
preserved so that the analyst may retain several pages at his desk.
Cost considerations are overwhelmingly in favor of the typewriter
system, the cost ratio being on the order of ten to one, cathode-ray
tube over typewriter.

4.1.3.4 Equipment Evaluation Problem

The overall system would benefit by having essentially
the same device in use for all remote input functions whether query
or indexing. Requirements in common are: typewriter keyboard, special
function switches, optional ability to connect directly to a computer,
optional possibility of linking two systems together for use by one
person.

In addition, system designers may consider these devices
as communications terminals, that is, as elements of the communication
system. Buffering, then, is a consideration in selection of
equipment. Use of the transcriber as a buffer will be considered in
the next section, on communications.

4.2 Communications

4.2.1 The Communication Problem

Although this system is being designed to operate within
a large building there is a communication problem in bringing data to
the computer system and back again to the users.

In the presently operating systems, data is hand carried
from room to room. In this system, much of the information flow
will be electrical in order to meet possible requirements for on-line
indexing and quick-response, conversational querying.

The elements required are I/O terminals, largely being
modified electric typewriters; lines, here probably telephone or
teletype; and multiplexers for switching and buffering. Buffering in
such a system accomplishes two purposes. First, it holds information
while awaiting processing. For example, if the computer is busy
when a query arrives, it can be stored in a buffer until the computer
is ready to take it. The second purpose is to store data temporarily
for possible local processing before passing on to another communication
point. The prime example of this is holding data at an indexer
station until he is sure it is error free or to permit a second
pass at the index record for adding additional data.
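
The first purpose of buffering, holding a query until the computer is
free, amounts to a first-in, first-out queue. The sketch below is a
software analogy only; the class name Buffer and its methods are
invented, and nothing here describes the actual equipment under
consideration.

```python
from collections import deque


class Buffer:
    """First-in, first-out holding area between a terminal and the computer."""

    def __init__(self):
        self._held = deque()

    def accept(self, message):
        """Hold a message that has arrived from a terminal."""
        self._held.append(message)

    def release(self, computer_ready):
        """Pass held messages on, in arrival order, once the computer is ready."""
        released = []
        while computer_ready and self._held:
            released.append(self._held.popleft())
        return released

    def __len__(self):
        return len(self._held)


buffer = Buffer()
buffer.accept("query 1")
buffer.accept("query 2")
print(buffer.release(computer_ready=False))  # [] because the computer is busy
print(buffer.release(computer_ready=True))   # ['query 1', 'query 2']
```

The second purpose follows the same pattern, with the indexer's own
confirmation that a record is error free taking the place of the
computer-ready signal.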

The problems faced in the design of this system are
selection of equipment configurations to perform these basic functions
and deciding where in the system each function will be performed.

4.2.2 Terminals

The I/O terminals have been described under Input. Here,
it should be pointed out that these devices can also act as buffers.
There are three general classifications of terminals as buffers:
no terminal buffering as in the case of a direct input device such
as teletypewriter; paper tape buffering such as a Flexowriter gives
to provide input sequencing flexibility and thus prevent communication
system overloading; magnetic tape buffering as in the message composer
which gives communication flexibility and also permits rapid recall
of records for error correction.

There are other kinds of terminals also. In many systems
there must be a device between the transcriber and the line; these
generally perform some signal conversion function. Examples are the
IBM 1053, or the AT and T Dataphone. Still another class of terminals
is the security terminal which safeguards the signal. This device can
affect the choice of the others.

4.2.3 Lines

The most likely choices here are telephone or Teletype
lines. For security reasons, the latter seems the more appropriate
for this project. No major implementation problems are expected in
this area.

4.2.4 Multiplexers 

These vary considerably in capability from simple 
switching to switching, buffering, and stored-program processing of 
data, including editing and priority determination. The simple 
switching multiplexers are virtually the same as terminals; the 
more elaborate ones are full-fledged computers. 
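
As an illustration of the priority determination mentioned above, the 
following is a minimal sketch of a stored-program multiplexer that buffers 
traffic from several channels and forwards the most urgent message first. 
The class name PriorityMultiplexer, its methods, and the convention that a 
lower number means a higher priority are assumptions made for this example 
only. 

```python
import heapq
import itertools


class PriorityMultiplexer:
    """Hypothetical sketch: buffer messages and forward them by priority."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: order of arrival

    def accept(self, channel, message, priority):
        # Assumed convention: a lower priority number is more urgent.
        entry = (priority, next(self._arrival), channel, message)
        heapq.heappush(self._heap, entry)

    def next_message(self):
        # Return (channel, message) for the most urgent buffered item,
        # or None when the buffer is empty.
        if not self._heap:
            return None
        _, _, channel, message = heapq.heappop(self._heap)
        return channel, message
```

Messages of equal priority leave in the order they arrived, so a burst of 
routine traffic cannot reorder itself while waiting behind urgent items. 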

Problems in choosing multiplexers mainly revolve around 
where to provide buffering capability, priority or overload procedures, 
and complexity of the communication problem presented (e.g., 
conversational querying and indexing is a much bigger communication 
problem because of the large number of duplex channels needed). 

4.2.5 Data Control 

While not strictly speaking a communication problem, this 
is an appropriate place to introduce the concept of a Data Control Group. 
Its function would be control of files, especially new entries and 
changes to Authority and Fact Files. File maintenance is a complex 
and important process not only mechanically but also with respect to 
intelligence production. It may well be the decision of management 
that individual analysts may not make file modifications without 
proper approval, and at first this requirement appears incompatible 
with the concept of high speed processing of data. 

The general needs for a data control group would be 
the following: to have all prospective file changes routed to it 
before posting but not necessarily before actual entry into the 
computer and to have all files available for checking specific 
items or for general surveillance. In the sense of these requirements, 
the control group is simply another analytic group needing the same 
kind of access and communications with the computer as substantive 
analysts and indexers. 

Hence, the concepts of an analyst wishing to make direct 
postings to an information file and management wishing to exercise 
tight quality control over file entries are by no means incompatible. 
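
One way to picture this arrangement is sketched below: an analyst's change 
is entered into the computer at once but is held as pending, and it is 
posted to the file only when the control group approves it. This is an 
illustrative sketch under stated assumptions, not a specification; the 
class ControlledFile, its methods, and the ticket numbering are 
hypothetical. 

```python
import itertools


class ControlledFile:
    """Hypothetical sketch: changes are entered at once, posted on approval."""

    def __init__(self):
        self.posted = {}   # the file as seen by retrieval
        self.pending = {}  # changes entered but awaiting the control group
        self._tickets = itertools.count(1)

    def enter_change(self, analyst, key, value):
        # Entry into the computer is immediate; posting is deferred.
        ticket = next(self._tickets)
        self.pending[ticket] = (analyst, key, value)
        return ticket

    def review(self, ticket, approved):
        # The control group either posts the change or discards it.
        _analyst, key, value = self.pending.pop(ticket)
        if approved:
            self.posted[key] = value
        return approved
```

Because entry and posting are separate steps, the analyst is not delayed 
at the terminal, and the posted file changes only through review. 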

4.3 Central computer complex 

4.3.1 Computer 

A high speed central processor of a type well within the 
current state of the art is definitely indicated. The computer must 
be capable of input-output over many channels in an overlap mode. 
Processing speed should be high, but performance of the required 
searches in the required times should not present any major equipment 
problem. 


4.3.2 Memory 

Memory capacity requirements implied by the system 
performance specifications are indicated in Figure 4-1. The three 
curves represent the fastest, slowest, and medium system implementation 
schedules. The capacity requirement presents a problem to the system 
engineering task. Random access devices such as disc storage devices 
are required. Several such devices are being considered which will 
satisfy the minimum system load requirements. 

4.3.3 Programs 

The possibility exists that the central computer complex 
will be time shared by the information retrieval system and other 
computer applications. Even without this possibility, the complexity 
of the information retrieval system itself necessitates the development 
of a control program. All of the functions of this control program 
cannot be given at this time. The file processing programs must 
be not only generalized but also efficient. One of these is usually gained 
only at the expense of the other, but the proper balance must be 
attained for this system. Generalization is imperative in order to 
meet such requirements as cyclic queries and file interaction 
queries. In order to meet the timing specifications of the system, 
the programs must operate quite efficiently. Input-output programs 
must be written to process the traffic to and from the various on-line 
stations. Process interrupt capability and priority handling capability 
are needed. 

4.4 Document storage 

4.4.1 Capacity 

A graph similar to the one illustrating memory capacity 
requirements is shown in Figure 4-2 for document storage requirements. 
Again, the three curves represent the fastest, slowest, and medium 
system implementation schedules. 

4.4.2 Search times 

In order to satisfy the requirement for 15 minute document 
retrieval responses, the document handling must be at least in part 
mechanized. Out of the 15 minutes, it is assumed that 10-12 minutes 
are available for document retrieval. 

4.4.3 Document Form 

Due to the sheer bulk of the document store for the system, 
it is desirable to store microimages of the original hard copy documents. 
The actual retrieved item may be the microimage, a paper copy 
reproduced from the microimage, or a paper copy reproduced from the 
original document. 